Abstract. We consider the two-dimensional range maximum query (2D-RMQ)
problem: given an array A of ordered values, pre-process it so that we can
find the position of the largest element in a (user-specified) range of rows
and range of columns. We focus on determining the effective entropy of
2D-RMQ, i.e., how many bits are needed to encode A so that 2D-RMQ queries can
be answered without access to A. We give tight upper and lower bounds on the
expected effective entropy for the case when A contains independent
identically-distributed random values, and new upper and lower bounds for
arbitrary A, for the case when A contains few rows. The latter results
improve upon upper and lower bounds by Brodal et al. (ESA 2010). We also give
some efficient data structures for 2D-RMQ whose space usage is close to the
effective entropy.



1 Introduction 

In this paper, we study the two-dimensional range maximum query problem
(2D-RMQ). The input to this problem is a two-dimensional m by n array A
of $N = m \cdot n$ elements from a totally ordered set. We assume w.l.o.g.
that $m \le n$ and that all the entries of A are distinct (identical entries
of A are ordered lexicographically by their index). We consider queries of
the following types. A 1-sided query consists of the positions in the array
in the range $q = [1 \cdots m] \times [1 \cdots j]$, where $1 \le j \le n$.
(For the case m = 1 these may also be referred to as prefix maximum queries.)
For a 2-sided query the range is $q = [1 \cdots i] \times [1 \cdots j]$,
where $1 \le i \le m$ and $1 \le j \le n$; for a 3-sided query,
$q = [1 \cdots i] \times [j_1 \cdots j_2]$, where $1 \le i \le m$ and
$1 \le j_1 \le j_2 \le n$; and for a 4-sided query, the query range is
$q = [i_1 \cdots i_2] \times [j_1 \cdots j_2]$, where $1 \le i_1 \le i_2 \le m$
and $1 \le j_1 \le j_2 \le n$. In each case, the response to a query is the
position of the maximum element in the query range, i.e.,
$\mathrm{RMQ}(A, q) = \arg\max_{(i,j) \in q} A[i, j]$. If the number of sides
is not specified we assume the query is 4-sided.
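The four query types can be pinned down with a brute-force reference implementation. The following is a minimal sketch that reads the array A directly (so it is not an encoding in the sense of this paper); the function names are illustrative and do not come from the original text.

```python
from typing import List, Tuple

Pos = Tuple[int, int]  # (row, column), both 1-based as in the text


def rmq(A: List[List[int]], i1: int, i2: int, j1: int, j2: int) -> Pos:
    """4-sided query: position of the maximum in rows i1..i2, columns j1..j2."""
    best = (i1, j1)
    for i in range(i1, i2 + 1):
        for j in range(j1, j2 + 1):
            # Entries are assumed distinct, so ties need no special handling.
            if A[i - 1][j - 1] > A[best[0] - 1][best[1] - 1]:
                best = (i, j)
    return best


def rmq_1sided(A: List[List[int]], j: int) -> Pos:
    """Query [1..m] x [1..j]."""
    return rmq(A, 1, len(A), 1, j)


def rmq_2sided(A: List[List[int]], i: int, j: int) -> Pos:
    """Query [1..i] x [1..j]."""
    return rmq(A, 1, i, 1, j)


def rmq_3sided(A: List[List[int]], i: int, j1: int, j2: int) -> Pos:
    """Query [1..i] x [j1..j2]."""
    return rmq(A, 1, i, j1, j2)
```

Each restricted query type is a special case of the 4-sided query with some of the four boundaries fixed, which is why the wrappers only forward their arguments.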
We focus on the space requirements for answering this query in the encoding
model [4], where the aim is to pre-process A and produce a representation of
A which allows 2D-RMQ queries to be answered without accessing A any further.
We now briefly motivate this particular question. Lossless data compression
is concerned with the information content of data, and how effectively to
compress/decompress the data so that it uses space close to its information
content. However, there has been an explosion of interest in operating
directly (without decompression) on compressed data via succinct or
compressed data structures [12, 8]. In such situations, a fundamental issue
that needs to be considered is the "information content of the data
structure," formalized as follows. Given a set of objects S, and a set of
queries Q, consider the equivalence class C on S induced by Q, where two
objects from S are equivalent if they provide the same answer to all queries
in Q. Whereas traditional succinct data structures are focussed on storing a
given $x \in S$ using at most $\lceil \log |S| \rceil$ bits (the entropy of
S), we consider the problem of storing x in $\lceil \log |C| \rceil$ bits
(the effective entropy of S with respect to Q) while still answering queries
from Q correctly. In what follows, we will abbreviate "the effective entropy
of S with respect to Q" as "the effective entropy of Q."

Although this term is new, the question is not: a classical result, using
Cartesian trees [22], shows that given an array A with n values from
$\{1, \ldots, n\}$, only $2n - O(\log n)$ bits are required to answer 1D-RMQ
without access to A, as opposed to the $\Theta(n \log n)$ bits needed to
represent A itself. The low effective entropy of 1D-RMQ is useful in many
applications, e.g. it is used to simulate access to LCP information in
compressed suffix arrays (see e.g. [19]). This has motivated much research
into data structures whose space usage is close to the $2n - O(\log n)$ lower
bound and which can answer RMQ queries quickly (see [9] and references
therein). In addition to being a natural generalization of the 1D-RMQ, the
2D-RMQ query is also a standard kind of range reporting query.

Previous Work. The 2D-RMQ problem, as stated here, was proposed by Amir
et al. [1]. (The variant where elements are associated with a sparse set of
points in 2D, introduced by [10], is fundamentally different and is not
discussed further here.) Building on work by Atallah and Yuan [2], Brodal
et al. [4] gave a hybrid data structure that combined a compressed "index" of
$O(N)$ bits along with the original array A. Queries were answered using the
index along with $O(1)$ accesses to A. They showed that this is an optimal
point on the trade-off between the number of accesses and the memory used. In
contrast, Brodal et al. refined Demaine et al.'s [6] earlier lower bound to
show that the effective entropy of 2D-RMQ is $\Omega(N \log m)$ bits, thus
resolving in the negative Amir et al.'s open question regarding the existence
of an $O(N)$-bit encoding for the 2D-RMQ problem. Brodal et al. also gave an
$O(N \min\{m, \log n\})$ bit encoding of A. Recalling that m is the smaller
of the two dimensions, it is clear that Brodal et al.'s encoding is
non-optimal unless $m = n^{\Omega(1)}$.



All logarithms are to base 2 unless stated otherwise. 



Our Results. We primarily consider two cases of the above problem: (a) the
random case, where the input A comprises N independent, uniform (real) random
numbers from [0, 1), and (b) the case of fixed m, where A is worst-case, but
m is taken to be a (fairly small) constant. Random inputs are of interest in
practical situations and provide insights into the lower bounds of [6, 4]
that could inform the design of adaptive data structures which could use
significantly less space for practical inputs; for instance, we show that the
2D-RMQ can be encoded in $O(N)$ expected bits as opposed to
$\Omega(N \log m)$ bits for the worst case. For the case of fixed m, we
determine the precise constants in the effective entropy for particular
values of m; applying the techniques of Brodal et al. directly yields
significantly non-optimal lower and upper bounds. These results use ideas
that may be relevant to solving the asymptotic version of the problem. The
majority of our effort is directed towards determining the effective entropy
and providing concrete encodings that match the effective entropy, but we
also in some cases provide data structures that support range maximum queries
space- and time-efficiently on the RAM model with logarithmic word size.

Effective Entropy on Random Inputs. We first consider the 1D-RMQ problem
for an array $A[1 \cdots n]$ (i.e. m = 1) and show that, in contrast to the
worst case lower bound of $2n - O(\log n)$ bits, the expected effective
entropy of RMQ is $\le cn$ bits for $c \approx 1.736\ldots$, where the
precise value of c equals $2 \sum_{i=1}^{\infty} \frac{\log i}{(i+1)(i+2)}$.
We also give another encoding that is more "local" and achieves expected
$cn + o(n)$ bits for $c < 1.9183\ldots$.

In the 2D case, A is an m × n array with $2 \le m \le n$. We show bounds on
the expected effective entropy of RMQ as below:

    1-sided                      2-sided                             3-sided                         4-sided
    $\Theta((\log n)^2)$ bits    $\Theta((\log n)^2 \log m)$ bits    $\Theta(n (\log m)^2)$ bits     $\Theta(nm)$ bits

The 2D bounds are considerably lower than the known worst-case bounds of
$O(n \log m)$ for the 1-sided case, $O(nm)$ for the 2-sided case, and known
lower bounds of $\Omega(nm)$ and $\Omega(nm \log m)$ for the 3-sided and
4-sided cases respectively. The above results also hold in the weaker model
where we assume all permutations of A are equally likely. We also give a data
structure that supports (4-sided) RMQ queries in $O(1)$ time using expected
$O(nm)$ bits of space.

Effective Entropy for Small m. Our results for the 2D RMQ problem (4-sided
queries) with worst-case inputs are as follows:

1. We give an encoding based on "merging" Cartesian trees. While this
encoding uses $O(nm^2)$ bits, the same as that of Brodal et al. [4], it has
lower constant factors: e.g., it uses $5n - O(\log n)$ bits when m = 2 rather
than $7n - O(\log n)$ bits [4]. We also give a data structure for the case
m = 2 that uses $(5 + \epsilon)n$ bits and answers queries in 0(i2S^^) time
for any $\epsilon > 0$.



This encoding has also been discovered by Brodal (personal communication). 



2. We give a lower bound on the effective entropy based on "merging"
Cartesian trees. This lower bound is not asymptotically superior to the lower
bound of Brodal et al. [4], but for all fixed m < 2^^ it gives a better lower
bound than that of Brodal et al. For example, we show that for m = 2, the
effective entropy is $5n - O(\log n)$ bits, exactly matching the upper bound,
but the method of Brodal et al. yields only a lower bound of n/2 bits.

3. For the case m = 3, we give an encoding that requires
$(6 + \log 5)n - O(\log n) \approx 8.32n$ bits. Brodal et al.'s approach
requires $(12 + \log 5)n - O(\log n) \approx 14.32n$ bits and the method in
(1) above would require about 9n bits. Our lower bound from (2) is
$8n - O(\log n)$ for this case.

The paper is organized as follows: in Section 2 we give bounds on the expected 
entropy for random inputs, in Section 3 we consider the case of small m and 
Section 4 gives the new data structures. 

2 Random Input 

In this section we consider the case of "random" inputs, where the array A is
populated with $N = m \cdot n$ independent uniform random real numbers in the
range [0, 1). We first consider the 1D-RMQ problem giving two encodings, one
optimal but less convenient to decode (Theorem 1) and another that is less
compact, but easier to decode (Theorem 2). We then consider the 2D cases,
beginning with 1-sided queries (Theorem 3), 2-sided (Theorem 4) and finally
4- and 3-sided queries (Theorem 5).

2.1 1D RMQ problems.

For the 1D case, we begin by outlining the Cartesian tree [22]. Given an
array A containing n distinct numbers, its Cartesian tree is an unlabeled
n-node binary tree in which each node corresponds to a unique position in the
array, and is defined recursively as follows: the root of the tree
corresponds to position i, where A[i] is the maximum element in A, and the
left and right subtrees of the root are the Cartesian trees for the
sub-arrays $A[1 \ldots i-1]$ and $A[i+1 \ldots n]$ respectively (the
Cartesian tree of a null array is the empty binary tree). A Cartesian tree
for an array A can be used to answer 1D-RMQ on A via a lowest common ancestor
query on the Cartesian tree. We first show:
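To make the definition concrete, the following is a minimal sketch of a max-Cartesian tree and of RMQ answered from the tree shape alone. It uses the standard linear-time stack construction rather than the recursive definition, 0-based indices, and function names chosen here for illustration; none of these choices come from the original text.

```python
def cartesian_tree(A):
    """Return (root, left, right) for the max-Cartesian tree of A.

    Nodes are identified with array indices (their in-order rank);
    left[v] / right[v] hold the child of node v, or -1 if absent.
    """
    n = len(A)
    left, right = [-1] * n, [-1] * n
    stack = []  # indices whose values decrease from bottom to top
    for v in range(n):
        last = -1
        # Everything smaller than A[v] on the stack ends up in v's left subtree.
        while stack and A[stack[-1]] < A[v]:
            last = stack.pop()
        left[v] = last
        if stack:
            right[stack[-1]] = v
        stack.append(v)
    root = stack[0] if stack else -1
    return root, left, right


def rmq_from_tree(root, left, right, i, j):
    """Position of the maximum of A[i..j], using only the tree (not A)."""
    v = root
    # The first node on the root path whose index lies in [i, j] is the
    # lowest common ancestor of i and j.
    while not (i <= v <= j):
        v = left[v] if j < v else right[v]
    return v
```

Because nodes are numbered in in-order, the lowest common ancestor can be found by a plain descent from the root; a constant-time LCA structure would replace this loop in a real data structure.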

Theorem 1. The expected effective entropy of 2-sided queries on a 1D array of
size n is at most $\alpha n + O(\log n)$ bits, where
$\alpha \approx 1.736\ldots$ has the exact value
$\alpha = 2 \sum_{i=1}^{\infty} \frac{\log i}{(i+1)(i+2)}$.



Proof. We first give an encoding of a Cartesian tree. In what follows, we
number the nodes of the Cartesian tree in in-order, so that node v
corresponds to A[v]. Each node v is the root of a subtree of size
$s_v \ge 1$ (which represents a sub-array of A of size $s_v$). The relative
offset $o_v$ of v is an integer in the range $0..s_v - 1$ representing its
relative position in this sub-array (if v has a left child x, then
$o_v = s_x$, otherwise $o_v = 0$). The encoding is obtained by visiting the
nodes of the Cartesian tree in pre-order and writing down the values $o_v$ in
the order they are encountered. It is not hard to see that the Cartesian tree
can be uniquely decoded from this encoding. If root denotes the root of the
Cartesian tree, the first value in the sequence, which is $o_{root}$, gives
the sizes of the left and (since n is known) right subtrees: we obtain the
sub-sequences corresponding to the subtrees and recurse.

A naive implementation of this encoding requires $O(n \log n)$ bits since
each relative offset potentially needs $O(\log n)$ bits; even encoding each
relative offset $o_v$ in $\lceil \log s_v \rceil$ bits may cause this
encoding to exceed the known upper bound of $2n - O(1)$ bits. For a more
space-efficient encoding, define the weight of a node v, $w_v$, as the
product of $s_x$ for all nodes x in v's subtree, including v itself. The
sequence of values $o_v$, written in pre-order, is viewed as a mixed-radix
integer, where the radix of $o_v$ is $s_v$, and the root and last node in
pre-order are taken to be the least and most significant digit respectively.
The value of the resulting integer is clearly in the range
$0..w_{root} - 1$. This integer is computed as follows. We traverse the
Cartesian tree in pre-order. Suppose that v is a node with left child l and
right child r. Before returning from v to v's parent, having completed the
traversal of the subtrees rooted at l and r and encoded them as integers
$e_l$ and $e_r$, we encode v as the integer
$e_v = e_r w_l s_v + e_l s_v + o_v$ in the range $0..w_v - 1$. If the final
encoding is $e_{root}$, then $e_{root} \bmod n = o_{root}$; the rest of the
decoding can be done essentially by converting the recursive decoding
algorithm described in the previous paragraph to an iterative one. Observe
that the size of the encoding is
$\lceil \log w_{root} \rceil = \lceil \sum_v \log s_v \rceil$ bits; we now
bound the latter quantity.
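The mixed-radix encoding can be sketched as follows. This is an illustrative, quadratic-time sketch under a few assumptions not in the original: indices are 0-based, an empty subtree is given value 0 and weight 1, and the names `encode` and `decode` are hypothetical.

```python
def encode(A):
    """Return (e, w): the mixed-radix code e of A's max-Cartesian tree and
    the weight w of the root, so that 0 <= e < w."""
    def rec(lo, hi):                      # handles the sub-array A[lo:hi]
        s = hi - lo                       # subtree size s_v
        if s == 0:
            return 0, 1                   # empty subtree: value 0, weight 1
        root = max(range(lo, hi), key=A.__getitem__)
        o = root - lo                     # relative offset o_v = size of left subtree
        e_l, w_l = rec(lo, root)
        e_r, w_r = rec(root + 1, hi)
        # e_v = e_r * w_l * s_v + e_l * s_v + o_v, with weight w_l * w_r * s_v
        return e_r * w_l * s + e_l * s + o, w_l * w_r * s
    return rec(0, len(A))


def decode(e, n):
    """Recover, in pre-order, triples (lo, hi, root): the maximum of the
    sub-array [lo, hi) sits at index root. Only e and n are used."""
    roots = []

    def rec(lo, hi):
        nonlocal e
        s = hi - lo
        if s == 0:
            return
        o, e = e % s, e // s              # peel off the least significant digit
        roots.append((lo, hi, lo + o))
        rec(lo, lo + o)                   # left subtree has size o
        rec(lo + o + 1, hi)               # right subtree has size s - 1 - o

    rec(0, n)
    return roots
```

Decoding reads the digits least significant first, which is exactly pre-order: after the root's digit is removed, the left subtree's digits occupy the low-order part and what remains after them is the code of the right subtree.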

For every i such that $0 \le i \le n - 1$, an array of random independent
numbers A will have its maximum element at position i + 1 with probability
1/n. Consequently, the associated distribution of Cartesian trees will
witness a tree with exactly i nodes in its left subtree and n − 1 − i nodes
in its right subtree with probability 1/n; the root will correspond to
position i + 1. Let S(n) denote the expected value of $\sum_v \log s_v$ for a
random array A of size n, and note that $\lceil S(n) \rceil$ is the expected
size of the encoding. Taking S(0) = 0, then for all $n \ge 1$:



$$ S(n) = \log n + \frac{1}{n} \sum_{i=0}^{n-1} \bigl( S(i) + S(n-1-i) \bigr)
        = \log n + \frac{2}{n} \sum_{i=0}^{n-1} S(i). \qquad (1) $$



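In fact, Equation (1) is identical to the recurrence for the entropy of
n-node random binary search trees [15], viz. counting the expected number of
bits needed to describe a binary search tree produced by a random permutation
of n distinct numbers. It has the exact solution S(0) = 0 and

$$ S(n) = \log n + 2(n+1) \sum_{i=1}^{n-1} \frac{\log i}{(i+1)(i+2)} $$

for $n \ge 1$. Since every term of the sum is non-negative, the sum is at
most $\sum_{i=1}^{\infty} \frac{\log i}{(i+1)(i+2)}$, so S(n) is at most
$2n \sum_{i=1}^{\infty} \frac{\log i}{(i+1)(i+2)} + O(\log n)$. The result
follows.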



□ 
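Recurrence (1), its closed form and the constant in Theorem 1 can be checked numerically. The sketch below is only a sanity check in floating point; the function names and the truncation point of the infinite sum are choices made here, not part of the original.

```python
from math import log2


def S_recurrence(n_max):
    """S(0..n_max) from recurrence (1): S(n) = log n + (2/n) * sum_{i<n} S(i)."""
    S = [0.0] * (n_max + 1)
    prefix = 0.0                       # running value of S(0) + ... + S(n-1)
    for n in range(1, n_max + 1):
        S[n] = log2(n) + 2.0 * prefix / n
        prefix += S[n]
    return S


def S_closed(n):
    """Closed form: log n + 2(n+1) * sum_{i=1}^{n-1} log i / ((i+1)(i+2))."""
    tail = sum(log2(i) / ((i + 1) * (i + 2)) for i in range(1, n))
    return log2(n) + 2 * (n + 1) * tail


def alpha(terms=200_000):
    """Truncated value of 2 * sum_{i>=1} log i / ((i+1)(i+2))."""
    return 2 * sum(log2(i) / ((i + 1) * (i + 2)) for i in range(1, terms))
```

The truncated sum slightly underestimates the constant, since all omitted terms are positive; with the default number of terms the error is well below 0.001.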



Although not a primary concern of this section, it should be noted that the
above encoding appears to require $O(n^2)$ time to decode. A less compact
encoding, which is more "local" and is linear-time decodable, is given below:

Theorem 2. There is a linear-time decodable encoding of 1D RMQ that uses
$cn + o(n)$ bits for $c = 1.9183\ldots < 1.92$.

Proof. We study the distribution of different kinds of nodes in the Cartesian
tree of a random array. Each node in a Cartesian tree can be of four types:
it can have two children (type-2), only a left or right child
(type-L/type-R), or it can be a leaf (type-0). Consider an element A[i] for
$1 \le i \le n$ and observe that the type of the i-th node in the Cartesian
tree in in-order (which corresponds to A[i]) is determined by the relative
values of l = A[i − 1], m = A[i] and r = A[i + 1] (adding dummy random
elements in A[0] and A[n + 1]). Specifically:

1. if r > m > l then node i is type-L;

2. if l > m > r then node i is type-R;

3. if l > r > m or r > l > m then node i is type-0; and

4. if m > l > r or m > r > l then node i is type-2.

In a random array, the probabilities of the alternatives above are clearly
1/6, 1/6, 1/3 and 1/3. By linearity of expectation, if $N_x$ is the random
variable that denotes the number of type-x nodes, we have that
$E[N_0] = E[N_2] = n/3$ and $E[N_L] = E[N_R] = n/6$. The encoding consists of
traversing the Cartesian tree in either level-order or in depth-first order
(pre-order) and writing down the label of each node in the order it is
visited: it is known that this suffices to reconstruct the Cartesian tree
[14, 3]. The sequence of labels is encoded using arithmetic coding, choosing
the probability of type-0 and type-2 to be 1/3 and that of type-L and type-R
to be 1/6. The coded output would be of size
$(\log 6)(N_L + N_R) + (\log 3)(N_0 + N_2)$ (to within lower-order terms)
[13]; plugging in the expected values of the random variables $N_x$ gives
$(\log 6)/3 + 2(\log 3)/3 = 1.9183\ldots$ bits per node, which is the result.
It is easy to see how to decode the tree from this encoding in linear time. □
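The node classification and the resulting constant can be sketched as follows. One assumption differs from the proof: instead of dummy random elements at A[0] and A[n + 1], the sketch treats a missing neighbour as larger than everything, so the labels always match the actual tree; the function names are illustrative.

```python
from math import log2


def node_types(A):
    """Label ('0', 'L', 'R' or '2') of each max-Cartesian-tree node of A,
    listed by in-order position.

    Node i has a left child iff A[i-1] < A[i] and a right child iff
    A[i+1] < A[i]; at the two ends the missing neighbour counts as larger.
    """
    n = len(A)
    labels = {(False, False): '0', (True, False): 'L',
              (False, True): 'R', (True, True): '2'}
    types = []
    for i in range(n):
        has_left = i > 0 and A[i - 1] < A[i]
        has_right = i < n - 1 and A[i + 1] < A[i]
        types.append(labels[(has_left, has_right)])
    return types


def expected_bits_per_node():
    """Arithmetic-coding cost per node with Pr[L] = Pr[R] = 1/6 and
    Pr[0] = Pr[2] = 1/3: (1/3) log 6 + (2/3) log 3."""
    return (1 / 3) * log2(6) + (2 / 3) * log2(3)
```

A label sequence produced this way would then be fed to an arithmetic coder with the fixed probabilities above; the coder itself is omitted here.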

2.2 2D RMQ problems. 

We now consider the 2D case. 

Theorem 3. The expected effective entropy of 1-sided queries on an m × n
array is $\Theta(\log^2 n)$ bits.



Proof. For the upper bound observe that we can recover the answers to the
1-sided queries by storing the positions of the prefix maxima, i.e., those
positions (i, j) such that the value stored at (i, j) is the maximum among
those in positions $[1 \cdots m] \times [1 \cdots j]$. Since the position
(i, j) is a prefix maximum with probability 1/(jm) and can be stored using
$\lceil \log(nm + 1) \rceil$ bits, the expected number of bits used is at
most $\sum_{i=1}^{m} \sum_{j=1}^{n} \frac{1}{jm} \lceil \log(nm + 1) \rceil
= O(\log^2 n)$ bits.

Consider a random source that generates n elements of $\{0, 1, \ldots, m\}$
as follows: the i-th element of the source is j if the answer to the query
$[1 \cdots m] \times [1 \cdots i]$ is (j, i) for some j, and 0 otherwise.
Clearly the entropy of this source is a lower bound on the expected size of
the encoding. This source produces n independent elements of
$\{0, 1, \ldots, m\}$ with the i-th equal to j, $j = 1, \ldots, m$, with
probability 1/(im), and equal to 0 with probability 1 − 1/i. I.e., its
entropy is
$\sum_{i=1}^{n} \Bigl[ (1 - \tfrac{1}{i}) \log\bigl(\tfrac{i}{i-1}\bigr)
+ \sum_{j=1}^{m} \frac{\log(im)}{im} \Bigr] = \Omega(\log^2 n)$ bits. □
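The upper-bound encoding in this proof stores nothing but the positions of the prefix maxima. A minimal sketch, with hypothetical function names and the positions kept as a plain Python list rather than a bit string:

```python
def prefix_maxima(A):
    """Positions (i, j), 1-based, whose value is the maximum of
    [1..m] x [1..j], listed in increasing column order."""
    m, n = len(A), len(A[0])
    out, best = [], float("-inf")
    for j in range(n):
        # Only the largest entry of column j can be a new prefix maximum.
        i = max(range(m), key=lambda r: A[r][j])
        if A[i][j] > best:
            best = A[i][j]
            out.append((i + 1, j + 1))
    return out


def answer_1sided(maxima, j):
    """Answer the query [1..m] x [1..j] from the stored positions alone."""
    ans = None
    for pos in maxima:
        if pos[1] <= j:
            ans = pos      # later entries hold larger values
    return ans
```

The answer to a 1-sided query is the last stored position whose column does not exceed j, since stored values increase along the list.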

Theorem 4. The expected effective entropy of 2-sided queries on an m × n
array is $\Theta(\log^2 n \log m)$ bits.

Proof. As in the proof of Theorem 3, we store a list of the positions of the
2-sided prefix maxima sorted by their values. By 2-sided prefix maxima we
mean those positions (i, j) where the value in that position is maximum among
all those in $[1 \cdots i] \times [1 \cdots j]$. The answer to any query is
the position of the largest such 2-sided prefix maximum inside the query.
This can be determined from the sorted list of positions. The expected number
of bits in the encoding is at most
$\sum_{i=1}^{m} \sum_{j=1}^{n} \lceil \log(nm + 1) \rceil / (ij)
= \Theta(\log^2 n \log m)$ bits.

The lower bound is also similar. From an encoding for 2-sided queries of an
m × n array, we can create a source of nm independent bits with a bit being 1
if and only if the answer to the query $[1 \cdots i] \times [1 \cdots j]$ is
(i, j) (which occurs with probability 1/(ij)). The entropy of this source is
at least $\sum_{i=1}^{m} \sum_{j=1}^{n} \log(ij)/(ij)
= \Omega(\log^2 n \log m)$ bits. □
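The 2-sided encoding can be sketched in the same way. The sketch below finds the 2-sided prefix maxima by brute force and keeps only their order by value; the function names are hypothetical and no attempt is made at an efficient construction.

```python
def two_sided_prefix_maxima(A):
    """Positions (i, j), 1-based, holding the maximum of [1..i] x [1..j],
    ordered from largest value to smallest."""
    m, n = len(A), len(A[0])
    found = []
    for i in range(m):
        for j in range(n):
            if all(A[r][c] <= A[i][j]
                   for r in range(i + 1) for c in range(j + 1)):
                found.append((A[i][j], i + 1, j + 1))
    found.sort(reverse=True)
    # The values are dropped: the encoding keeps only the sorted positions.
    return [(i, j) for _, i, j in found]


def answer_2sided(maxima, i, j):
    """Answer the query [1..i] x [1..j]: the first (largest) stored
    position that lies inside the query range."""
    for pos in maxima:
        if pos[0] <= i and pos[1] <= j:
            return pos
    return None  # unreachable for a valid query, since (1, 1) is always stored
```

The maximum of any 2-sided query range is itself a 2-sided prefix maximum, which is why scanning the list in decreasing order of value returns the correct position.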

Theorem 5. The expected effective entropies of 4-sided and 3-sided queries on
an m × n array are $\Theta(nm)$ bits and $\Theta(n (\log m)^2)$ bits
respectively.

Proof. We begin with the 4-sided case. For each position (i, j) we store a
region which has the property that for any query containing (i, j) and lying
entirely within that region, (i, j) is the answer to that query. This
contiguous region is delimited by a monotone (along columns or rows) sequence
of positions in each of the quadrants defined by (i, j). A position (k, l)
delimits the boundary of the region of (i, j) if the value in position (k, l)
is the largest and the value in position (i, j) is the second largest, in the
sub-array defined by (i, j) and (k, l), i.e., any query in this sub-array not
including positions on row k or column l is answered with (i, j) (dealing
with boundary conditions appropriately). The answer to any query is the
(unique) position inside the query whose region entirely contains the query.
For each position, we store a clockwise ordered list of the positions
delimiting the region (starting with the position above i in column j) by
giving the position's column and row offset from (i, j).



NB: positions are given as row index then column index, not x-y coordinates. 
Fig. 1. Regions for the italicised values for 4-sided queries (left) and 3-sided queries
(right), with region delimiters in blue.



The expected number of bits required to store any region is at most 4 times
the expected number of bits required to store the region of (1, 1), whose
boundary runs diagonally from the first column to the first row. A position
(i, j) delimits the boundary of (1, 1) ($i = 1, \ldots, m + 1$,
$j = 1, \ldots, n + 1$, excluding the case i = j = 1) with probability
$1/(ij(ij - 1))$ (if it contains the largest value and (1, 1) is the second
largest in the sub-array $[1 \cdots i] \times [1 \cdots j]$) and its offsets
require at most $2(\lceil \log(i + 1) \rceil + \lceil \log(j + 1) \rceil)$
bits to store. I.e., the expected number of bits stored per position is at
most

$$ O\Biggl( \sum_{(i,j) \ne (1,1)}
   \frac{2(\lceil \log(i+1) \rceil + \lceil \log(j+1) \rceil)}{ij(ij-1)}
   \Biggr) = O(1). $$

By linearity of expectation, the expected number of bits stored is $O(nm)$.
The bound is tight as we can generate $\lfloor n/2 \rfloor m$ equiprobable
independent random bits (of entropy $\Omega(nm)$) from A by reporting 1 iff
the answer to the query consisting of the two positions $(2i - 1, j)$ and
$(2i, j)$ is $(2i, j)$, for $i = 1, \ldots, \lfloor n/2 \rfloor$,
$j = 1, \ldots, m$.

For the 3-sided case, recall that we focus on queries that are open to the
"top" side. We again define the region of position (i, j) as the area such
that (i, j) is the answer to any query containing (i, j) and lying entirely
within that area. In contrast to the 4-sided case, the region of a point may
be empty (if it is not a prefix maximum in its column). For points with
non-empty regions, their region is delimited on the right by a monotone
sequence of positions (k, l) such that l > j for all positions, and k > i for
all positions but one (see Fig. 1). The left delimiters are symmetric, and
the region is obviously delimited from below by the next prefix maximum in
the j-th column. To answer RMQs, we store all (ordered) pairs (p, q) such
that position q delimits the region of position p (assuming p has a non-empty
region). The pairs are stored sorted by p's column, and are represented as
follows (all numbers are assumed stored in a self-delimiting manner, e.g.
using Gamma codes). We store p's and q's row numbers using $O(\log m)$ bits
and the difference between the columns of p and q using
$O(1 + \log(|j - l| + 1))$ bits.

The pair (p, q) is stored iff the value in position q = (k, l) is the largest
and the value in position p = (i, j) is the second largest in the sub-array
defined by p and q, i.e. $A[1 \cdots \max\{i, k\}][j \cdots l]$ (assuming
that l > j). For a fixed pair (p, q), $p \ne q$, the probability that (p, q)
is stored is $\frac{1}{(ab)(ab - 1)} \le 2/(ab)^2$, where
$a = \max\{i, k\}$ and $b = |j - l| + 1$. We now calculate the expected cost
of storing (p, q) over all pairs (p, q), taking j = 1 to simplify the
summation (arbitrary j will have a summation at most twice that of j = 1).

$$ \sum_{i=1}^{m} \sum_{l=1}^{n} \sum_{k=1}^{m}
   \frac{2}{(\max\{i,k\} \cdot l)^2} \cdot O(\log m + \log l)
   = O\Biggl( \sum_{i=1}^{m} \sum_{l=1}^{n}
     \frac{\log m + \log l}{i \cdot l^2} \Biggr)
   = O\Biggl( \sum_{i=1}^{m} (\log m)/i \Biggr)
   = O((\log m)^2). $$

The summation uses the observation that, for any fixed i,
$\sum_{k=1}^{m} \frac{1}{(\max\{k, i\})^2}
= \sum_{k=1}^{i} \frac{1}{i^2} + \sum_{k=i+1}^{m} \frac{1}{k^2} = O(1/i)$.
Summing over all j, we get that the expected effective entropy is
$O(n (\log m)^2)$. For the lower bound, the n columns can be considered
independent 1 × m prefix maxima problems each requiring expected
$\Omega(\log^2 m)$ bits by Theorem 3. □



3 Small m 



Brodal et al. [4] gave a 2D-RMQ encoding of size essentially
$(\frac{m(m+3)}{2} - O(\log m)) \cdot 2n \approx n \cdot m(m + 3)$ bits for
an m × n array. In order that precise comparisons can be made for fixed
values of m, we outline their approach. For each of the m rows of the matrix,
they store a Cartesian tree for that row, and for each of the
$m(m - 1)/2$ possible subranges of rows, they store a Cartesian tree for the
maximum value in each column that lies within that set of rows. Since we
consider m fixed in this section, the space bound for these Cartesian trees
is $(m(m + 1)/2)(2n - O(\log n))$ bits, which is essentially
$n \cdot m(m + 1)$ bits. Given any query spanning a subrange of rows, the
Cartesian tree for that subrange tells us which column the range maximum lies
in. However, to find which row the maximum lies in, Brodal et al. also store
a Cartesian tree for each column of the matrix. The space used by these
column-wise Cartesian trees needs to be calculated more carefully since m is
small, and is taken to be
$n \cdot \log\bigl( \frac{1}{m+1} \binom{2m}{m} \bigr)$ (we do not take the
ceiling of the log since the Cartesian trees for all columns could be encoded
together). Specifically this gives:



    m    row-wise CT           column-wise CT       Total
    2    $6n - O(\log n)$      $n$                  $7n - O(\log n)$
    3    $12n - O(\log n)$     $n \cdot \log 5$     $\approx 14.32n$
    4    $20n - O(\log n)$     $n \cdot \log 14$    $\approx 23.81n$
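The totals in the table can be reproduced by direct arithmetic. The sketch below computes only the coefficient of n, ignoring the $O(\log n)$ terms; the function name is chosen here for illustration.

```python
from math import comb, log2


def brodal_upper_coeff(m):
    """Coefficient of n in the encoding of Brodal et al. as outlined above."""
    row_wise = m * (m + 1)                  # m(m+1)/2 Cartesian trees, 2n bits each
    catalan = comb(2 * m, m) // (m + 1)     # number of binary trees on m nodes
    column_wise = log2(catalan)             # bits per column for its Cartesian tree
    return row_wise + column_wise


for m in (2, 3, 4):
    print(m, round(brodal_upper_coeff(m), 2))
```

The column-wise term is the logarithm of the m-th Catalan number because each column contributes one Cartesian tree on m nodes and all columns can be coded jointly.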



Furthermore, Brodal et al. showed that the effective entropy of 2D-RMQ is at



least log((^!)'^ 2 J) bits. For m = 2, their techniques give a lower bound of 
n/2, but this is worse than the obvious lower bound of 4n — O(logn) obtained 
by considering each row independently. 

In this section we improve upon these results for small m. Our main tool is 
the following lemma: 



Lemma 1. Let A be an arbitrary m × n array, $m \ge 2$. Given an encoding
capable of answering range maximum queries of the form
$[1 \cdots (m-1)] \times [j_1 \cdots j_2]$ ($1 \le j_1 \le j_2 \le n$) and an
encoding answering range maximum queries on the last row of A, n additional
bits are necessary and sufficient to construct an encoding answering queries
of the form $[1 \cdots m] \times [j_1 \cdots j_2]$
($1 \le j_1 \le j_2 \le n$) on A.

Proof. The proof has two parts, one showing sufficiency (upper bound) and the 
other necessity (lower bound). 

Upper Bound. We construct a joint Cartesian tree that can be used in
answering queries of the form $[1 \cdots m] \times [j_1 \cdots j_2]$ for
$1 \le j_1 \le j_2 \le n$, using an additional n bits. The root of the joint
Cartesian tree is either the answer to the query
$[1 \cdots (m-1)] \times [1 \cdots n]$ or the answer to the query
$[m] \times [1 \cdots n]$. We store a single bit indicating the larger of
these two values. We now recurse on the portions of the array to the left and
right of the column with the maximum, storing at each level of the recursion
a single bit that indicates which sub-problem the winner comes from.
Following this procedure, using the n additional bits it created, we can
construct the joint Cartesian tree. To answer queries of the form
$[1 \cdots m] \times [i \cdots j]$, the lowest common ancestor x (in the
joint Cartesian tree) of i and j gives us the column in which the maximum
lies. Moreover, the comparison that placed x at the root of its subtree also
tells us if the maximum lies in the m-th row or in rows $1 \cdots m-1$; in
the latter case, query the given data structure for rows $1 \cdots m-1$.
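The merging procedure can be sketched for two rows as follows. In this sketch the decoder receives two RMQ oracles (standing in for the given encodings of rows 1..m−1 and of the last row) plus the n bits; it never reads the values. The function names, 0-based columns and the dictionary representation of the joint tree are assumptions made here for illustration.

```python
def argmax(row, lo, hi):
    """Index of the largest entry of row[lo..hi] (inclusive, 0-based)."""
    return max(range(lo, hi + 1), key=row.__getitem__)


def merge_bits(top, bottom):
    """Encoder side: one bit per column, in pre-order of the joint tree."""
    bits = []

    def rec(lo, hi):
        if lo > hi:
            return
        t, b = argmax(top, lo, hi), argmax(bottom, lo, hi)
        bottom_wins = bottom[b] > top[t]
        bits.append(1 if bottom_wins else 0)
        col = b if bottom_wins else t
        rec(lo, col - 1)
        rec(col + 1, hi)

    rec(0, len(top) - 1)
    return bits


def build_joint_tree(top_rmq, bottom_rmq, bits, n):
    """Decoder side: uses only the two RMQ oracles and the n merge bits."""
    nodes = {}        # (lo, hi) -> (row, col); row 0 = upper part, 1 = last row
    it = iter(bits)

    def rec(lo, hi):
        if lo > hi:
            return
        row = next(it)
        col = bottom_rmq(lo, hi) if row else top_rmq(lo, hi)
        nodes[(lo, hi)] = (row, col)
        rec(lo, col - 1)
        rec(col + 1, hi)

    rec(0, n - 1)
    return nodes


def joint_query(nodes, n, qlo, qhi):
    """Answer the query over all rows and columns qlo..qhi."""
    lo, hi = 0, n - 1
    while True:
        row, col = nodes[(lo, hi)]
        if col < qlo:
            lo = col + 1
        elif col > qhi:
            hi = col - 1
        else:
            return row, col
```

Each recursive call on a non-empty column range fixes exactly one column, so exactly n bits are written; the descent in `joint_query` plays the role of the lowest common ancestor query.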

Lower Bound. For simplicity we consider the case m = 2; it is easy to see
that by considering the maxima of the first m − 1 elements of each column the
general problem can be reduced to that of an array with two rows. Let the
elements of the top and bottom rows be $t_1, t_2, \ldots, t_n$ and
$b_1, b_2, \ldots, b_n$. Given two arbitrary Cartesian trees T and B that
describe the answers to the top and bottom rows, the procedure described in
the upper bound for constructing the Cartesian tree for the 2 × n array from
T and B makes exactly n comparisons between some $t_i$ and $b_j$. Let
$c_1, \ldots, c_n$ be a bit string that describes the outcomes of these
comparisons in the order in which they are made. We now show how to assign
values to the top and bottom rows that are consistent with any given T, B,
and comparison string $c_1, \ldots, c_n$. Notice this is different from (and
stronger than) the trivial observation that there exists a 2 × n array A such
that merging T and B must use n comparisons: we show that T, B and the n bits
to merge the two rows are independent components of the 2 × n problem.

If the i-th comparison compares the maximum in $t_{l_i}, \ldots, t_{r_i}$
with the maximum in $b_{l_i}, \ldots, b_{r_i}$, say that the range
$[l_i, r_i]$ is associated with the i-th comparison. The following properties
of ranges follow directly from the algorithm for constructing the Cartesian
tree for both rows:

(a) for a fixed T and B, $[l_i, r_i]$ is uniquely determined by
$c_1, \ldots, c_{i-1}$;

(b) if j > i then the range associated with j is either contained in the
range associated with i, or is disjoint from the range associated with i.



By (a), given distinct bit strings $c_1, \ldots, c_n$ and
$c'_1, \ldots, c'_n$ that differ for the first time in position i, the i-th
comparison would be associated with the same interval $[l_i, r_i]$ in both
cases, and the query $[1..2] \times [l_i..r_i]$ then gives different answers
for the two bit strings. Thus, each bit string gives distinguishable 2 × n
arrays, and we now show that each bit string gives valid 2 × n arrays.

First note that if the i-th bit in a given bit string $c_1, \ldots, c_n$ is
associated with the interval [l, r], then it enforces the condition that
$t_j > b_k$, where $j = \arg\max_{i \in [l, r]} \{t_i\}$ and
$k = \arg\max_{i \in [l, r]} \{b_i\}$, or vice versa. Construct a digraph G
with vertex set $\{t_1, \ldots, t_n\} \cup \{b_1, \ldots, b_n\}$ which
contains all edges in T and B, as well as edges for conditions $t_j > b_k$
(or vice versa) enforced by the bit string. All arcs are directed from the
larger value to the smaller. We show that G is a DAG and therefore there is a
partial order of the elements satisfying T and B as well as the constraints
enforced by the bit string.

Suppose that for some value of $c_1, \ldots, c_n$, G is not a DAG. Pick any
cycle in G: there must be some node $t \in \{t_1, \ldots, t_n\}$ that is
explicitly enforced (i.e. by a comparison) to be greater than some
$b \in \{b_1, \ldots, b_n\}$, such that some descendant b' of b in B has been
explicitly enforced to be greater than an ancestor t' of t (or the symmetric
case with T and B interchanged must hold). Let the interval associated with
the b-t comparison be [l, r]. First consider the case that b = b'. Since an
element that wins a comparison is never compared again, the comparison
between b and t' must have occurred after the comparison with t, in which
case the interval associated with the b-t' comparison is a sub-interval of
[l, r] by property (b). This means that t' must be a descendant of t, a
contradiction. Therefore $b \ne b'$ and b' must be a proper descendant of b.
If b' belongs to [l, r], it will never have been compared prior to the b-t
comparison, and will subsequently only be compared (if at all) to a
descendant of t. If b' does not belong to [l, r], then there must have been a
comparison between b, or one of b's ancestors in B, that was won by a proper
ancestor t'' of t, such that the range [l'', r''] associated with that
comparison was split into two parts, one containing [l, r] and one containing
b'. Clearly, b' could not have been compared prior to this comparison, and
subsequently, can only be compared to elements from T that are in a different
subtree of T than t. □

Using Lemma 1 we show by induction: 

Theorem 6. There exists an encoding solving the 2D-RMQ problem on an m × n
array requiring at most $n \cdot \frac{m(m+3)}{2}$ bits.

Proof. The theorem follows by induction from Lemma 1 and the fact that 2n
bits are sufficient to store a Cartesian tree of a 1 × n array (the base
case). Given an encoding solving the RMQ problem for an (m − 1) × n array
(using $(m - 1)(m + 2)n/2$ bits by induction) and a Cartesian tree for a 1D
array (using 2n bits) we construct a solution to the 2D-RMQ problem on an
m × n array combining the two (with the 1D array as the last row of the
combined array) by using Lemma 1 to construct m − 1 Cartesian trees answering
queries of the form $[i \cdots m] \times [j_1 \cdots j_2]$ for
$1 \le i \le m - 1$ using $(m - 1)n$ additional bits. □
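The arithmetic behind the induction can be written out explicitly. Here U(m) is a symbol introduced only for this derivation (it does not appear in the original) and denotes the number of bits used by the encoding for m rows:

```latex
\begin{align*}
U(1) &= 2n, \\
U(m) &= U(m-1) + 2n + (m-1)\,n \\
     &= \frac{(m-1)(m+2)}{2}\,n + (m+1)\,n \\
     &= \frac{m^2 + m - 2 + 2m + 2}{2}\,n
      = \frac{m(m+3)}{2}\,n .
\end{align*}
```

For m = 2 and m = 3 this evaluates to 5n and 9n bits respectively, in line with the figures quoted for these cases elsewhere in the text.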



Theorem 7. The minimum space required for any encoding for the 2D-RMQ
problem on an m x n array is at least n · (3m - 1) - O(m log n) bits.

Proof. The result follows by induction from Lemma 1 and the fact that
2n - O(log n) bits are required to solve the RMQ problem for a 1 x n array (the
base case). Note that any encoding that solves the 2D-RMQ problem for an m x n
array must be able to solve the 1D RMQ problem on its last row, the 2D-RMQ
problem on the array consisting of the first m - 1 rows, as well as queries of
the form [1 ··· m] x [j1 ··· j2] (1 ≤ j1 ≤ j2 ≤ n). The first two problems are
entirely independent, i.e., answers to queries from one provide no information
about the answers to the other. The 1D problem requires 2n - O(log n) bits and
the additional queries require at least n bits by Lemma 1, i.e., 3n - O(log n) bits
are needed on top of those required for solving the problem on the first m - 1
rows. □
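The recurrence in this proof can be unrolled mechanically. The sketch below tracks only the leading terms (the O(log n) corrections are dropped, which is our simplification) and reports the gap to the upper bound that the induction of Theorem 6 yields, n · m(m + 3)/2. Function names are illustrative.

```python
def lower_bound_leading_bits(m: int, n: int) -> int:
    """Leading term of the lower bound for an m x n array, built row by row."""
    bits = 2 * n  # 1 x n base case
    for _ in range(2, m + 1):
        bits += 2 * n  # independent 1D problem on the new last row
        bits += n      # queries spanning all rows (Lemma 1)
    return bits


def gap_to_upper_bound(m: int, n: int) -> int:
    """Upper bound n * m(m + 3) / 2 minus the lower-bound leading term."""
    return n * m * (m + 3) // 2 - lower_bound_leading_bits(m, n)


for m in range(1, 8):
    assert lower_bound_leading_bits(m, 100) == 100 * (3 * m - 1)
```

The gap works out to n(m - 1)(m - 2)/2, so it vanishes for m = 1 and m = 2 and equals n for m = 3.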

For fixed m < 2^^ this lower bound is better (in n) than that of Brodal et al. 
[4]. For the case m = 2 our bounds are tight:

Corollary 1. 5n - O(log n) bits are necessary and sufficient for an encoding
answering range maximum queries on a 2 x n array.

For the case m = 3, our upper bound is 9n bits and our lower bound is
8n - O(log n) bits. We can improve the upper bound slightly:

Theorem 8. The 2D-RMQ problem can be solved using at most (6 + log 5)n +
o(n) ≈ 8.322n bits on a 3 x n array.

Proof. We refer to the three rows of the array as T (top), M (middle) and B
(bottom). We store Cartesian trees for each of the three rows (using 6n bits). We
now show how to construct data structures for answering queries for TM (the
array consisting of the top and middle rows), MB (the middle and bottom rows)
and TMB (all three rows) using an additional n log 5 + o(n) bits. Let 0 ≤ x ≤ 1 be
the fraction of nodes in the Cartesian tree for TMB such that the maximum lies in
the middle row. Given the trees for each row, and a sequence indicating for each
node in the Cartesian tree for TMB in in-order, which row contains the maximum
for that node, we can construct a data structure for TMB. The sequence of row
maxima is coded using arithmetic coding, taking Pr[M] = x and Pr[B] = Pr[T] =
(1 - x)/2; the output takes (-x log x - (1 - x) log((1 - x)/2))n + o(n) bits [13].

We now apply the same procedure as in Lemma 1 to construct the Cartesian 
trees for TM and MB, storing a bit to answer whether the maximum is in the top 
or middle row for TM (middle or bottom row for MB) for each query made in the 
construction of the tree starting with the root. However, before comparing the 
maxima in T and M in some range, we check the answer in TMB for that range; if 
TMB reports the answer is in either T or M, we do not need to store a bit for that 
range for TM. It is easy to see that every maximum in TMB that comes from T 
or B saves one bit in TM or MB, and every maximum in TMB that comes from M 
saves one bit in both TM and MB. Thus, a total of 2n - (1 - x)n - 2xn = (1 - x)n
bits are needed for TM and MB. The total number of bits needed for all three
trees (excluding the o(n) term) is (2(1 - x) - x log x - (1 - x) log(1 - x))n. This
takes a maximum at x = 1/5 of n log 5. □
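The location and value of this maximum can be checked numerically. The sketch below evaluates the per-column cost 2(1 - x) - x log x - (1 - x) log(1 - x), with logarithms to base 2 and the usual convention 0 log 0 = 0, on a grid of step 1/1000. The grid search is our own device for illustration, not part of the proof.

```python
import math


def extra_bits_per_column(x: float) -> float:
    """2(1 - x) - x log x - (1 - x) log(1 - x), logs base 2, 0 log 0 = 0."""
    def plogp(p: float) -> float:
        return 0.0 if p == 0.0 else p * math.log2(p)
    return 2.0 * (1.0 - x) - plogp(x) - plogp(1.0 - x)


# Evaluate on a grid over [0, 1] and pick the maximiser.
grid = [i / 1000 for i in range(1001)]
best_x = max(grid, key=extra_bits_per_column)

assert best_x == 0.2
assert abs(extra_bits_per_column(best_x) - math.log2(5)) < 1e-9
```

Setting the derivative -2 + log((1 - x)/x) to zero gives (1 - x)/x = 4, i.e. x = 1/5, which agrees with the grid search.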



4 Data Structures for 2D-RMQ 

In this section, we give efficient data structures for 2D-RMQ. We begin with a
recap of rank and select operations on bit strings. Given a bit string S[1..t] of
length t, define the following operations, for x ∈ {0, 1}:

- rank_x(S, i) returns the number of occurrences of x in the prefix S[1..i].

- select_x(S, i) returns the position of the i-th occurrence of x in S.

Such a data structure is called a fully indexable dictionary (FID) by Raman et
al. [17], who show that:

Theorem 9. There is a FID for a bit string S of size t using at most
lg (t choose r) + o(t) bits, that supports all operations in O(1) time on the RAM
model with wordsize O(lg t) bits, where r is the number of 1s in the bit string.

Since lg (t choose r) ≤ t, the space used by the FID is always at most t + o(t) bits.
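To fix the semantics of the two operations (1-based positions, inclusive prefixes), here is a deliberately naive Python sketch. It scans the bit string in linear time and uses no compression, so it illustrates only the interface of an FID, not the O(1)-time structure of Theorem 9.

```python
def rank(bits, x, i):
    """Number of occurrences of x in the prefix bits[1..i] (1-based, inclusive)."""
    return sum(1 for b in bits[:i] if b == x)


def select(bits, x, i):
    """1-based position of the i-th occurrence of x, or None if there is none."""
    seen = 0
    for pos, b in enumerate(bits, start=1):
        if b == x:
            seen += 1
            if seen == i:
                return pos
    return None


S = [1, 0, 1, 1, 0, 0, 1]
assert rank(S, 1, 4) == 3
assert select(S, 0, 2) == 5
# rank and select are inverse to each other on occurrences of x.
assert all(rank(S, 1, select(S, 1, i)) == i for i in range(1, 5))
```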

Theorem 10. There is a data structure for 2D-RMQ on a random m x n array
A which answers queries in O(1) time using O(mn) expected bits of storage.

Proof. Take N = mn and λ = ⌈2 log log N⌉ + 1, and define the label of an element
z = A[i, j] as min{⌈log(1/(1 - z))⌉, λ} if z ≠ 0. The labels bucket the elements
into λ buckets of exponentially decreasing width. We store the following:

(a) An m x n array L, where L[i, j] stores the label of A[i, j].

(b) For labels x = 1, 2, ..., λ - 1, take r = 2^(2x) and partition A into r x r
submatrices (called grid boxes or grid sub-boxes below) using a regular grid
with lines r apart. Partition A four times, with the origin of the grid once
each at (0, 0), (0, r/2), (r/2, 0) and (r/2, r/2). For each grid box, and for all
elements labelled x in it, store their relative ranks within the grid box.

(c) For all values with label λ, store their global ranks in the entire array.
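The labelling step can be sketched as follows. We assume the entries are real numbers in [0, 1), read the label as min{⌈log(1/(1 - z))⌉, λ} with λ = ⌈2 log log N⌉ + 1 (logarithms to base 2), and, as a further assumption of ours, place z = 0 in the lowest bucket. Function names are illustrative.

```python
import math


def label_cap(N: int) -> int:
    """lambda = ceil(2 log log N) + 1, logs base 2 (requires N >= 4)."""
    return math.ceil(2 * math.log2(math.log2(N))) + 1


def label(z: float, cap: int) -> int:
    """Bucket of z in [0, 1): bucket x covers values with 1 - z in [2^-x, 2^-(x-1))."""
    if z <= 0.0:
        return 1  # assumption: z = 0 joins the lowest bucket
    return min(math.ceil(math.log2(1.0 / (1.0 - z))), cap)


cap = label_cap(2 ** 16)
labels = [label(z, cap) for z in (0.3, 0.6, 0.9, 0.999999)]
```

Under a uniform distribution, about half of the values receive label 1, a quarter label 2, and so on, which is the sense in which the buckets have exponentially decreasing width.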

The query algorithm is as follows: 

1. Find an element with the largest label in the query rectangle. 

2. If the query contains an element with label λ, or if the maximum label is x
and the query fits into a grid box at the granularity associated with label x,
then use (c) or (b) respectively to answer the query.

A query fails if the maximal label in the query rectangle is x < λ but the
rectangle does not fit into any grid box associated with label x. For this case:



(d) Explicitly store the answer for all queries that fail. 



We now give an efficient implementation of the above, and prove the stated space 
bounds. 

The data structures to support steps (1) and (2) also store (a) and (b) in
O(N) bits. Firstly, L is represented as a bit string that represents the
concatenation of all elements of L in row-major order, with a label x encoded
in unary as 1^x 0. Since the expected number of elements with label ≥ x is
O(N/2^x), it follows that the expected size of this encoding is O(N) bits. To
access L[i, j] in O(1) time, we store this bit string as an FID (Theorem 9) and
use the select_0 operation on this bit string. Furthermore, we use Brodal et
al.'s "hybrid" 2D-RMQ indexing structure [4, Theorem 3] over the array L: this
data structure stores an index of O(N) bits, and answers queries in O(1) time,
using O(1) comparisons between elements of L. The comparisons are implemented
by accessing L and breaking ties arbitrarily, implementing step (1).
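The unary encoding of L and the access via select on zeros can be illustrated as below. The select here is a linear scan standing in for the O(1)-time FID operation, and the helper names are ours.

```python
def encode_labels(labels):
    """Concatenate labels in unary: label x becomes x ones followed by a zero."""
    bits = []
    for x in labels:
        bits.extend([1] * x)
        bits.append(0)
    return bits


def select0(bits, i):
    """1-based position of the i-th zero (linear scan; an FID does this in O(1))."""
    seen = 0
    for pos, b in enumerate(bits, start=1):
        if b == 0:
            seen += 1
            if seen == i:
                return pos
    raise IndexError("fewer than i zeros in the bit string")


def access(bits, k):
    """Label of the k-th element (1-based) in row-major order."""
    end = select0(bits, k)
    start = select0(bits, k - 1) if k > 1 else 0
    return end - start - 1  # number of ones between consecutive zeros


row_major_labels = [1, 3, 2, 1]
encoded = encode_labels(row_major_labels)
assert [access(encoded, k) for k in range(1, 5)] == row_major_labels
```

For an m x n array stored in row-major order with 1-based indices, L[i, j] is the element number k = (i - 1)n + j.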

However, we cannot use the above approach to implement step (2), since the
problem now focusses on a sparse set of points with label x. Hence, we use a
data structure to solve the following problem: for each label value x < λ, answer
range maximum queries which lie entirely in a grid box of size r x r where
r = 2^(2x). We allow the data structure for a given grid box to use O((t + 1)x)
bits of space, where t is the number of elements in the box that have label x.
Note, however, that x < 2 log log N + 1 so r = O((log N)^4). For any grid box
where t ≤ c log N / log log N for some sufficiently small constant c > 0, this can
be done by table lookup, since we can write down all the coordinates as well as
the relative priorities as a bit string of fewer than (log N)/2 bits: the required
table will be of size at most O(√N) words, or o(N) bits. The space used per
grid block is clearly O(tx) since the bit string comprises just the positions of
the points and their relative priorities, all of which take O(x) bits. The expected
space used across all grid blocks for label x is at most O(xN/2^x) bits, summing
up to O(N) expected bits overall.

For larger values of x we use a data structure that takes o(N / log log N)
expected bits for each value of x, but since there are O(log log N) values of x
this is still o(N) bits overall. We divide each grid box into grid sub-boxes each
of side r', where r' = 2^(2y) for the largest y such that (r')^2 ≤ c log N / log log N.
The number of sub-boxes is N/(r')^2, or O(N log log N / log N). For this grid box,
we store a matrix R which is (r/r') x r where each entry corresponds to a row
of a sub-box, and contains the largest relative rank in this row of this sub-box.
The space used by R is O((N/r') log log N) = o(N / log log N). Using Brodal et
al.'s data structure we can do 2D-RMQ queries on R. We also create an r x (r/r')
matrix C where each entry corresponds to a column of a sub-box, and similarly
we can do 2D-RMQ queries on C. A general query can either be decomposed
into O(1) queries on R and C, plus O(1) 2-sided or 3-sided queries each on one
sub-block, or is a 4-sided query completely contained in a sub-block, each of
which is done by table lookup. This implements step (2) in O(1) time using
O(N) bits. A very similar data structure is used to handle elements with label
λ. We divide the input matrix A into sub-boxes of size log N x log N, and create
matrices R' and C' which represent elements with label λ in each row/column
respectively. We store R' and C' explicitly using O(N) bits each and store a
2D-RMQ indexing structure on R' and C'. To answer queries inside a sub-block,
we use an algorithm quite similar to that used for the smaller labels.

We finally need to bound the space usage in (d), and also to describe how to
represent the solutions. We classify queries based upon their area, i.e. the number
of positions they contain. For a given value of the area A, there are at most A
(in fact, considerably fewer) different aspect ratios that give rise to that area,
and at most N queries with that aspect ratio, or NA queries in all. To encode
the maximum in a query with area A requires O(log A) bits. The smallest grid
granularity that will contain all queries of area A is the granularity associated
with label x = ⌈(log A)/2⌉. If a query of area A contains no positions with labels
≥ x, it may fail. The probability of this happening is at most (1 - 2^(-x+1))^A =
2^(-Ω(√A)), so the expected number of failing queries of area A is O(N/A^2). For
each area A < (log N)^4, we store a minimal perfect hash function from
[1 ··· N] x [1 ··· A] to [1..N_A], where N_A is the number of failing queries of
area A (the domain of the hash function specifies the top left corner and, say,
the width of the query). Such hash functions can be stored in O(N_A + log log N)
bits and evaluated in O(1) time [21], and are used to index into an array of
length N_A that contains the answer to that query. The space used is
Σ_{A=1}^{(log N)^4} O(N_A · log A + log log N) bits, and since E(N_A) = O(N/A^2),
by linearity of expectation, the expected space used is O(N). □

We now show how to support 2D-RMQ queries efficiently on a 2 x n array, using
space close to that of Corollary 1. In the following data structure, we use a
Cartesian tree augmented with additional leaves, which we call an augmented
Cartesian tree or ACT for short. The ACT of a given array A is the Cartesian
tree in which every node is augmented with a leaf in between its left and right
children. If a node has only a left child then we add the leaf as its second child,
and if it only has a right child, then we add the leaf as its first child. Finally, if a
node (in the Cartesian tree) is a leaf, then we add the new leaf as its child. This
structure was used by Sadakane [18] to obtain a space-efficient data structure
supporting RMQ queries. The indices of the array A correspond to the inorder
numbers of the nodes in the Cartesian tree, and to the preorder numbers of the
leaves in the ACT.
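The correspondence between array indices, inorder numbers in the Cartesian tree, and preorder numbers of the ACT leaves can be checked on a small example. The sketch below builds a max-Cartesian tree by a naive quadratic recursion (not a space-efficient construction), uses 0-based indices, and assumes distinct values. The tuple representation is our own choice.

```python
def cartesian_tree(a, lo, hi):
    """Max-Cartesian tree of a[lo..hi] as nested tuples (left, index, right)."""
    if lo > hi:
        return None
    root = max(range(lo, hi + 1), key=lambda i: a[i])
    return (cartesian_tree(a, lo, root - 1), root, cartesian_tree(a, root + 1, hi))


def inorder(tree):
    """Indices of the Cartesian tree nodes in inorder."""
    if tree is None:
        return []
    left, idx, right = tree
    return inorder(left) + [idx] + inorder(right)


def augment(tree):
    """ACT: every node gets a new leaf placed between its left and right children."""
    left, idx, right = tree
    children = []
    if left is not None:
        children.append(augment(left))
    children.append(("leaf", idx))  # the added leaf remembers its parent's index
    if right is not None:
        children.append(augment(right))
    return ("node", idx, children)


def preorder_leaves(act):
    """Parent indices of the added leaves, listed in preorder of the ACT."""
    out, stack = [], [act]
    while stack:
        v = stack.pop()
        if v[0] == "leaf":
            out.append(v[1])
        else:
            stack.extend(reversed(v[2]))  # visit children left to right
    return out


a = [3, 9, 1, 7, 5, 8, 2]
tree = cartesian_tree(a, 0, len(a) - 1)
assert inorder(tree) == list(range(len(a)))
assert preorder_leaves(augment(tree)) == list(range(len(a)))
```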

Theorem 11. There is a data structure for 2D-RMQ on an arbitrary 2 x n
array A, which answers queries in O(k) time using 5n + O(n log k/k) bits of
space, for any parameter k = (log n)^O(1).

Proof. For an integer array of length n, the 2D-maxHeap of Fischer and Heun [9]
uses 2n + O(n log log n / log n) bits and supports RMQ queries on the array in
O(1) time. (Fischer and Heun [9] use the term 2D-minHeap, as they consider the
problem of answering range minimum queries.) The lower order term can be
further reduced to O(n/(log n)^O(1)) using the tree representation of Sadakane
and Navarro [20]. We use this 2D-maxHeap structure (which is essentially a
space- and query-efficient representation of a Cartesian tree) to support
queries on each of the individual rows, using a total of 4n + o(n) bits. The
upper bound of Lemma 1 shows how to combine these two Cartesian trees
(2D-maxHeaps) using n bits, to answer queries involving both the rows. We
refer to these n bits as the bit vector M (that merges the two Cartesian
trees). Each of these n bits in M corresponds to a unique column in A, and the
bits in M are written in the order of the inorder numbers of the nodes in the
joint Cartesian tree. These bits can be decoded in that order in O(1) time per
bit, using the Cartesian trees for the individual rows, to reconstruct the
joint Cartesian tree. A query involving both the rows can be answered by
decoding the first bit that falls within the query range. Thus the worst-case
query complexity is O(n) for this representation.

If we represent the joint Cartesian tree for both the rows as a 2D-maxHeap,
and an additional n bits to indicate the column maxima, we can answer any
RMQ query in O(1) time using a total of 7n + o(n) bits. We now describe how
to reduce the space to achieve the trade-off described in the statement of the
theorem.

The main idea is to represent the ACT of the joint Cartesian tree using a
succinct tree representation based on tree decomposition. The representation
decomposes the ACT into O(n/k) microtrees, each of size at most k, and
represents the microtrees using a total of 4n + o(n) bits (as the ACT has 2n
nodes), and stores several auxiliary structures of total size O(n log k/k) bits.
It supports various queries (such as LCA) in constant time by accessing a
constant number of microtrees and reading a constant number of words from the
auxiliary structures. Instead of storing the representations of microtrees, we
show how to reconstruct any microtree in O(k) time (in fact, time proportional
to its size) using the bit vector M (together with additional auxiliary
structures of size O(n log k/k) bits). The new representation consists of the
2D-maxHeap for both the rows (4n + o(n) bits) and the bit vector M that 'merges'
these two trees (n bits) in addition to these auxiliary structures. Thus the
representation uses 5n + O(n log k/k) bits overall, and supports RMQ queries in
O(k) time. We now describe this in detail.

We take the ACT of the joint Cartesian tree and partition it into O(n/k)
microtrees, each of size at most k, for some parameter k > 2, using the tree
decomposition algorithm of Farzan and Munro [7]. The microtrees produced by
the decomposition algorithm have the property that for each microtree, there is
at most one node such that one of its children is the root of another microtree. In
addition, two microtrees can share a common root. We modify the decomposition
so that whenever a node x is the root of two microtrees (a node cannot be the
root of more than two microtrees, as we have a ternary tree in which one of
the children of every internal node is a leaf), we remove the node x from both
the microtrees, and make another microtree containing x and its second child
(which is a leaf). One can show that the number of microtrees produced by the
modified decomposition is still O(n/k). Now, no two microtrees share a node,
and thus we obtain a partition of the ACT into microtrees.



Each leaf in the ACT corresponds to a column in the array A, and as
mentioned earlier, the leaves of the ACT in preorder correspond to the columns of
the array A from left to right. The above partitioning procedure splits the ACT
such that the columns corresponding to all the nodes in a microtree are in at
most two consecutive chunks in the array A (as all the nodes in a microtree
have consecutive preorder numbers, except when a node has a child outside the
microtree; and since there is at most one such node, the claim follows).

We store the auxiliary structures to support LCA and rank/select on leaves in
preorder, using O(n log k/k) bits. As the ACT has 2n nodes, we need 4n + o(n)
bits to store the representations of all the microtrees. Instead of storing the
representations of the microtrees, we show how to reconstruct the representation
of any microtree in time proportional to its size by simply storing the bit vector
M. In addition, we also store auxiliary structures to represent the correspondence
between the microtrees and the positions in the array, as explained below.

We construct two bit vectors B1 and B2 of length n + O(n/k) as follows: we
initialize both arrays with n zeroes each. We insert a 1 after the i-th zero in B1
(B2) if position i (i - 1) is the starting (ending) position of a chunk in A. We
store these two bit vectors using the FID structure of Theorem 9, which takes
O(n log k/k) bits of space. Now, considering the 1s in B1 as open parentheses
and the 1s in B2 as close parentheses, we 'merge' the two parenthesis sequences
to obtain a balanced parenthesis sequence that represents the tree structure
of the microtrees. We store this sequence along with an auxiliary structure to
support parenthesis operations, such as find-open, find-close, excess, rank-open,
rank-close [11]. Using all these data structures, we can support the following
operations: (i) number the microtrees in the sorted order of their preorder number
of the leftmost leaf, (ii) find the microtree that contains the leaf corresponding
to a given column index in A, and (iii) find the chunk corresponding to a given
microtree.

For each microtree μ, we store the bits of M corresponding to the leaves of
μ in the order of their preorder numbers. Given these bits, we now show how
to reconstruct μ in time linear in the size of μ. Suppose C_l and C_r are the
two chunks corresponding to μ (where C_r is empty if the nodes in μ correspond
to a single chunk). We first observe that all the elements that lie between the
chunks C_l and C_r in A are strictly smaller than the maximum element in the
last column of C_l as well as the maximum element in the first column of C_r.
Thus any RMQ query whose one end point lies in C_l and the other end point
lies in C_r has its answer in one of these two chunks and never in between these
chunks.

Let i be the first column of C_l and j be the last column of C_r. We first
find t = RMQ(A, [1] x [i..j]) and b = RMQ(A, [2] x [i..j]), which return the
positions of the maximum elements in the range [i..j] in the top and bottom
rows respectively. The first bit of M that we store for the microtree is a 0 or 1
depending on whether A[1, t] is greater or less than A[2, b]. Suppose that A[1, t]
is the larger of the two (the other case is similar). Then the root of μ has a
leaf child which corresponds to the position t. The leftmost (first) and rightmost
(third) subtrees of the root correspond to all the positions in the chunks C_l
and C_r that are to the left and right of the position t respectively, and can be
constructed recursively using the same procedure. Since each node of μ can be
'constructed' in O(1) time, the overall time to reconstruct μ is O(|μ|), where
|μ| denotes the size of μ.
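The recursive reconstruction can be sketched in Python for the simpler case of a single contiguous column range (the two-chunk case is not handled here). The decoder sees only per-row RMQ answers and the merge bits, never the array values. The row RMQ is a naive scan standing in for the 2D-maxHeap of each row, and the bit order (root first, then left subtree, then right subtree) is an assumption of this sketch.

```python
def row_rmq(row, i, j):
    """Stand-in for a per-row Cartesian tree: argmax of row[i..j] (0-based)."""
    return max(range(i, j + 1), key=lambda c: row[c])


def encode_merge_bits(top, bottom, i, j, bits):
    """Append one bit per joint-tree node: 0 if the top row wins, 1 otherwise."""
    if i > j:
        return
    t, b = row_rmq(top, i, j), row_rmq(bottom, i, j)
    top_wins = top[t] > bottom[b]
    bits.append(0 if top_wins else 1)
    c = t if top_wins else b
    encode_merge_bits(top, bottom, i, c - 1, bits)
    encode_merge_bits(top, bottom, c + 1, j, bits)


def decode_joint_tree(top_rmq, bottom_rmq, i, j, bits):
    """Rebuild (left, (row, col), right) from row RMQs and an iterator of bits."""
    if i > j:
        return None
    t, b = top_rmq(i, j), bottom_rmq(i, j)
    row, c = (0, t) if next(bits) == 0 else (1, b)
    left = decode_joint_tree(top_rmq, bottom_rmq, i, c - 1, bits)
    right = decode_joint_tree(top_rmq, bottom_rmq, c + 1, j, bits)
    return (left, (row, c), right)


top = [4, 11, 2, 9, 6]
bottom = [7, 3, 10, 1, 8]
merge_bits = []
encode_merge_bits(top, bottom, 0, len(top) - 1, merge_bits)
joint = decode_joint_tree(
    lambda i, j: row_rmq(top, i, j),
    lambda i, j: row_rmq(bottom, i, j),
    0, len(top) - 1, iter(merge_bits),
)
assert len(merge_bits) == len(top)  # one bit per column
assert joint[1] == (0, 1)           # overall maximum 11 is in the top row, column 1
```

Each recursive call consumes exactly one bit and does a constant number of row RMQs, which mirrors the O(1) work per node claimed above.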

To answer an RMQ query that spans both the rows, we use the same algorithm
that would answer the query if the tree decomposition representation were
stored explicitly. Whenever we need to access a microtree, we reconstruct it
using the above procedure, in O(k) time; all the auxiliary structures are stored
explicitly, and hence can be accessed in O(1) time. Since the query algorithm
accesses O(1) microtrees, the overall running time is O(k). Finally, the lower
order terms that arise in various substructures above, such as FIDs and auxiliary
structures for tree representations, which are independent of k, can be made
O(n/(log n)^O(1)) using the ideas from [16]. Thus the overall space used is
5n + O(n log k/k) bits, for any parameter k = (log n)^O(1). □

5 Conclusions and Open Problems 

We have given new explicit encodings for the RMQ problem, as well as (in some
cases) efficient data structures whose space usage is close to the sizes of the
encodings. We have focused on the cases of random matrices (which may have
relevance to practical applications such as OLAP cubes [5]) and the case of small
values of m. Obviously, the problem of determining the asymptotic complexity
of encoding RMQ for general m remains open.

References 

1. A. Amir, J. Fischer, and M. Lewenstein. Two-dimensional range minimum queries.
In Proc. 18th Annual Symposium on Combinatorial Pattern Matching, volume 4580
of LNCS, pages 286-294. Springer-Verlag, 2007.

2. M. J. Atallah and H. Yuan. Data structures for range minimum queries in mul- 
tidimensional arrays. In Proc. 20th Annual ACM-SIAM Symposium on Discrete 
Algorithms, pages 150-160. SIAM, 2010. 

3. D. Benoit, E. D. Demaine, J. I. Munro, R. Raman, V. Raman, and S. S. Rao.
Representing trees of higher degree. Algorithmica, 43(4):275-292, 2005.

4. G. S. Brodal, P. Davoodi, and S. S. Rao. On Space Efficient Two Dimensional 
Range Minimum Data Structures. In Proc. of European Symposium on Algorithms, 
volume 6347 of Lecture Notes in Computer Science, pages 171-182. Springer, 2010. 

5. S. Chaudhuri and U. Dayal. An overview of data warehousing and OLAP
technology. SIGMOD Rec., 26:65-74, March 1997.

6. E. D. Demaine, G. M. Landau, and O. Weimann. On Cartesian trees and range
minimum queries. In Proc. 36th International Colloquium on Automata, Languages
and Programming, volume 5555 of LNCS, pages 341-353. Springer-Verlag, 2009.

7. A. Farzan and J. I. Munro. A uniform approach towards succinct representation 
of trees. In J. Gudmundsson, editor, SWAT, volume 5124 of Lecture Notes in 
Computer Science, pages 173-184. Springer, 2008. 



8. P. Ferragina and G. Manzini. Indexing compressed text. JACM, 52:552-581, 2005. 

9. J. Fischer and V. Heun. Space-efficient preprocessing schemes for range minimum
queries on static arrays. SIAM J. Comput., 40(2):465-492, 2011.

10. H. N. Gabow, J. L. Bentley, and R. E. Tarjan. Scaling and related techniques 
for geometry problems. In Proc. 16th Annual ACM Symposium on Theory of 
Computing, pages 135-143. ACM, 1984. 

11. R. F. Geary, N. Rahman, R. Raman, and V. Raman. A simple optimal
representation for balanced parentheses. Theor. Comput. Sci., 368(3):231-246, 2006.

12. R. Grossi and J. S. Vitter. Compressed suffix arrays and suffix trees with
applications to text indexing and string matching. SICOMP, 35(2):378-407, 2005.

13. P. G. Howard and J. S. Vitter. Arithmetic coding for data compression. In M.-Y. 
Kao, editor, Encyclopedia of Algorithms. Springer, 2008. 

14. G. Jacobson. Space-efficient static trees and graphs. In FOCS, pages 549-554. 
IEEE, 1989. 

15. J. C. Kieffer, E.-H. Yang, and W. Szpankowski. Structural complexity of random
binary trees. In Proc. IEEE International Symposium on Information Theory
(ISIT), pages 635-639, 2009.

16. M. Patrascu. Succincter. In FOCS, pages 305-313. IEEE Computer Society, 2008. 

17. R. Raman, V. Raman, and S. S. Rao. Succinct indexable dictionaries with appli- 
cations to encoding k-ary trees and multisets. In SODA, pages 233-242, 2002. 

18. K. Sadakane. Space-efficient data structures for flexible text retrieval systems. In 
P. Bose and P. Morin, editors, ISAAC, volume 2518 of Lecture Notes in Computer 
Science, pages 14-24. Springer, 2002. 

19. K. Sadakane. Compressed suffix trees with full functionality. Theory of Computing
Systems, 41(4):589-607, 2007.

20. K. Sadakane and G. Navarro. Fully-functional succinct trees. In M. Charikar, 
editor, SODA, pages 134-149. SIAM, 2010. 

21. J. P. Schmidt and A. Siegel. The spatial complexity of oblivious k-probe hash
functions. SIAM J. Comput., 19(5):775-786, 1990.

22. J. Vuillemin. A unifying look at data structures. Communications of the ACM, 
23(4):229-239, 1980. 



